Skip to content

Conversation

@ludfjig
Copy link
Contributor

@ludfjig ludfjig commented Oct 2, 2025

This PR adds a poison state to sandbox in order to prevent further operations when the sandbox is left in an inconsistent state that could compromise memory safety, data integrity, or security. The sandbox becomes poisoned when guest functions abort/panic or when host-initiated execution cancellation occurs, leaving behind leaked heap allocations, corrupted data structures, or unreleased resources. For example, interrupting execution while guest is allocating can leave the global allocation lock in an inconsistent state, making future allocations impossible in subsequent runs due to infinite locking/spinning.

Poisoned sandboxes will reject all further operations (guest calls, snapshots, memory mapping) until the inconsistent state is resolved through either restoring to a snapshot or manually (unsafely) clearing the poison state.

Closes #848

Round 2:

I've changed the docs and implementation slightly. is_poison_error has been made private. Removed unsafe way of clearing poision. Users of hyperlight are expected on-error to check if the sandbox is poisoned, and deal with it accordingly, for example.

let result = sandbox.call::<()>("guest_panic", ());
if result.is_err() {
    if sandbox.poisoned() {
        // Restore from snapshot to clear poison
        sandbox.restore(&snapshot)?;
        assert!(!sandbox.poisoned());
        
        // Sandbox is now usable again
        sandbox.call::<String>("Echo", "hello".to_string())?;
    }
}

@ludfjig ludfjig added the kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. label Oct 2, 2025
@ludfjig ludfjig marked this pull request as ready for review October 2, 2025 19:15
@ludfjig ludfjig requested a review from Copilot October 2, 2025 20:07
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces sandbox poisoning functionality to prevent further operations when a sandbox is in an inconsistent state that could compromise memory safety. The sandbox becomes poisoned when guest functions abort/panic or when host-initiated execution cancellation occurs.

Key changes:

  • Added poisoned state tracking with safety checks across all sandbox operations
  • Implemented automatic poison detection for specific error types (GuestAborted, ExecutionCanceledByHost)
  • Added recovery mechanisms through snapshot restoration or manual poison clearing

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/hyperlight_host/src/sandbox/initialized_multi_use.rs Core implementation of poison state tracking, safety checks, and recovery mechanisms
src/hyperlight_host/src/error.rs Added PoisonedSandbox error variant with detailed documentation
src/hyperlight_host/tests/integration_test.rs Updated interrupt tests to clear poison state for continued execution
src/hyperlight_host/src/sandbox/snapshot.rs Added Debug trait to Snapshot struct
src/hyperlight_host/src/mem/shared_mem_snapshot.rs Added Debug trait to SharedMemorySnapshot struct

dblnz
dblnz previously approved these changes Oct 23, 2025
Copy link
Contributor

@dblnz dblnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff!
Changes LGTM. I left some clarifying comments, but overall good to go.

@ludfjig ludfjig force-pushed the poison branch 2 times, most recently from bca27e6 to 50f0c0a Compare October 23, 2025 23:06
Copy link
Contributor

@jsturtevant jsturtevant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, a few nits but happy to see what other say

jprendes
jprendes previously approved these changes Oct 27, 2025
Copy link
Contributor

@jprendes jprendes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me!
Thanks @ludfjig !

@ludfjig
Copy link
Contributor Author

ludfjig commented Oct 27, 2025

I am all for giving the user control, however I don't see a scenario where a user could resolve this without calling restore or creating a new sandbox, what did you have in mind?

Not exactly sure. Maybe some usecase for completely state-less guests? Not sure tbh.

@simongdavies
Copy link
Contributor

I am all for giving the user control, however I don't see a scenario where a user could resolve this without calling restore or creating a new sandbox, what did you have in mind?

Not exactly sure. Maybe some usecase for completely state-less guests? Not sure tbh.

No idea why this doesn't show up as a review comment, anyway my instinct is not to add this now and then if someone wants it add it later, seems like its safer that way

@ludfjig ludfjig force-pushed the poison branch 2 times, most recently from 617adae to 7cb8002 Compare October 27, 2025 22:12
@ludfjig ludfjig merged commit f3acc2e into hyperlight-dev:main Oct 27, 2025
49 checks passed
ludfjig added a commit to ludfjig/hyperlight that referenced this pull request Oct 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/enhancement For PRs adding features, improving functionality, docs, tests, etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consider adding some kind of "poisoned sandbox" state to prevent sandbox misuse

5 participants